
Tutorial for benchmarking #2499

Merged: 13 commits merged into main on Jul 10, 2025

Conversation

Contributor

@jainapurva jainapurva commented Jul 7, 2025

This pull request introduces comprehensive documentation updates for the TorchAO benchmarking framework:

  • docs/source/benchmarking_overview.md: Added a detailed tutorial on using the TorchAO benchmarking framework. It includes steps for adding APIs, model architectures, and CI dashboard integration, along with troubleshooting tips and best practices.

  • docs/source/benchmarking_user_faq.md: Introduced a new FAQ section to address common benchmarking use cases, with a placeholder for future content.


pytorch-bot bot commented Jul 7, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2499

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Jul 7, 2025
@jainapurva jainapurva marked this pull request as ready for review July 7, 2025 23:56
Contributor

@Copilot Copilot AI left a comment


Pull Request Overview

This PR introduces a new microbenchmarking tutorial for the TorchAO framework and integrates it into the documentation index.

  • Adds a comprehensive microbenchmarking.rst tutorial covering API/model integration, local benchmarking, and CI dashboard setup.
  • Updates index.rst to include the new microbenchmarking tutorial in the toctree.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| docs/source/microbenchmarking.rst | New tutorial for using the TorchAO benchmarking framework |
| docs/source/index.rst | Added microbenchmarking entry to the documentation index |

Comments suppressed due to low confidence (3)

docs/source/microbenchmarking.rst:6

  • The :ref: links to sections are missing corresponding label targets. Please add explicit rst labels (e.g. .. _add-api-to-benchmarking-recipes:) before each section heading so these references resolve correctly.
1. :ref:`Add an API to benchmarking recipes`

docs/source/microbenchmarking.rst:148

  • [nitpick] Grammar suggestion: change to "The output generated after running the benchmarking script is in the form of a CSV file." to improve readability and accuracy.
The output generated after running the benchmarking script, is the form of a csv. It'll contain the following:

docs/source/microbenchmarking.rst:38

  • There's a typo: “it-width” should be “bit-width” and consider rephrasing "appended to the string config in input" to improve clarity.
  If the ``AOBaseConfig`` uses input parameters, like bit-width, group-size etc, you can pass them appended to the string config in input
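As a reading aid for this transcript, here is a rough sketch of how a hyphen-separated recipe string like ``gemlitewo-<bit_width>-<group_size>`` could map onto a config object. The ``string_to_config`` helper, the import path, and the exact constructor parameters of ``GemliteUIntXWeightOnlyConfig`` are assumptions for illustration; the framework's real parsing logic lives in the benchmarking code and may differ.

```python
# Hypothetical sketch only: maps a recipe string like "gemlitewo-4-64" to a config.
# The helper name, import path, and constructor arguments are assumptions, not
# TorchAO's actual parsing API.
from torchao.quantization import GemliteUIntXWeightOnlyConfig


def string_to_config(recipe: str):
    """Parse '<name>-<bit_width>-<group_size>' into a config instance."""
    name, *params = recipe.split("-")
    if name == "gemlitewo":
        bit_width, group_size = (int(p) for p in params)
        return GemliteUIntXWeightOnlyConfig(bit_width=bit_width, group_size=group_size)
    raise ValueError(f"Unknown recipe string: {recipe}")


config = string_to_config("gemlitewo-4-64")  # bit_width=4, group_size=64
```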

Comment on lines 21 to 24
quantization
sparsity
contributor_guide
microbenchmarking

Copilot AI Jul 7, 2025


[nitpick] Consider alphabetizing the toctree entries (e.g., contributor_guide, microbenchmarking, quantization, sparsity) to keep navigation consistent.

Suggested change (reorder the toctree entries alphabetically):

From:
    quantization
    sparsity
    contributor_guide
    microbenchmarking

To:
    contributor_guide
    microbenchmarking
    quantization
    sparsity


@jainapurva jainapurva added the topic: documentation and topic: for developers labels Jul 7, 2025
If the ``AOBaseConfig`` uses input parameters, like bit-width, group-size etc, you can pass them appended to the string config in input
For example, for ``GemliteUIntXWeightOnlyConfig`` we can pass it-width and group-size as ``gemlitewo-<bit_width>-<group_size>``

2. Add a Model to Benchmarking Recipes
Contributor


For this one, can people add Hugging Face models easily?

Contributor Author


If by adding a model we mean adding an architecture, or custom shapes for existing architectures, then they can. This is a microbenchmarking recipe, hence to add a model we'll need to add its full architecture (like llama) for generating lower-level benchmarking numbers. Micro-benchmarking doesn't support specifying an hf-model name and importing it; that functionality can be included in future developments.
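To make that distinction concrete, here is a toy sketch of the kind of self-contained architecture a microbenchmarking recipe expects: a full module definition rather than a Hugging Face model name. The class name, shapes, and registration details below are illustrative assumptions; how a model actually gets registered with the framework is covered in the tutorial itself.

```python
# Illustrative only: a minimal, fully specified architecture of the sort a
# microbenchmark recipe needs. Class name, shapes, and registration details
# are assumptions, not the framework's actual model definitions.
import torch
import torch.nn as nn


class ToyLinearModel(nn.Module):
    """A tiny stand-in architecture with configurable matrix shapes."""

    def __init__(self, k: int = 1024, n: int = 1024, dtype=torch.bfloat16):
        super().__init__()
        self.linear = nn.Linear(k, n, bias=False, dtype=dtype)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)


device = "cuda" if torch.cuda.is_available() else "cpu"
model = ToyLinearModel().to(device).eval()
x = torch.randn(8, 1024, dtype=torch.bfloat16, device=device)
with torch.no_grad():
    out = model(x)  # sanity check that the architecture runs end to end
```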


This tutorial will guide you through using the TorchAO microbenchmarking framework. The tutorial contains different use cases for benchmarking your API and integrating with the dashboard.

1. :ref:`Add an API to benchmarking recipes`
Contributor

@jerryzh168 jerryzh168 Jul 8, 2025


nit: I think the order should be from most used to least used. Also, the doc feels more like a developer-facing "API reference" which centers around how to develop the benchmarking tool further (add new things, etc.), instead of "using" the benchmarking tool.

Wondering if this could be structured in a way that is more user-centered, that is, structured in terms of use cases / user flows:

  • what is the most common use case, and what should be the flow / interaction with the tool?

and introduce what people need to do in different use cases.

One use case I have is that I added a new tensor, e.g. for fbgemm, and I'd like to know how fbgemm quant compares against existing quant on existing microbenchmarks or some models.

e.g. how can the benchmarking tool simplify the changes needed to generate these tables: #2273 and #2276

Another could be kernel developers who are writing cutlass or triton kernels: what are the pieces they need to interact with?

And we can then optimize / simplify each flow after we have that.

But maybe I'm talking about a different doc; this doc can be useful as an "API reference" as well.

Contributor Author


Thanks @jerryzh168, this is a very helpful suggestion. There should be two docs: one for the API reference, and another for different use-cases, something like an FAQ or help doc.

Contributor

@drisspg drisspg left a comment


One minor overall nit is that I think we should write docs in markdown as often as possible since it is much more of a lingua franca

2. On the right sidebar, find the "Labels" section.
3. Click on the "Labels" dropdown and select "ciflow/benchmark" from the list of available labels.

Adding this label will automatically trigger the benchmarking CI workflow for your pull request.
Contributor


Can you also add where we can see the results?

Contributor


Also, will it run after each new commit is added, or when an existing commit is updated?

Contributor Author


If we add the label, it'll run for every commit we add to the PR.

4. In the dropdown menu, select the branch.
5. Click the "Run workflow" button to start the benchmarking process.

This will execute the benchmarking workflow on the specified branch, allowing you to evaluate the performance of your changes.
Contributor


What happens when people (1) push a new commit, or (2) update an existing commit?

Contributor


Also, when do we use (1) and when do we use (2)? They seem to be doing the same thing; if so, maybe keeping just one is enough.

Contributor Author


This will run only for the latest change on the branch; it won't trigger automatically on every commit. For that, we'll need to add the label to the PR.

Contributor Author

@jainapurva jainapurva Jul 9, 2025


Addressed in PR: #2512

Comment on lines 120 to 127
### Interpreting Results

The benchmark results include:

- **Speedup**: Performance improvement compared to baseline (bfloat16)
- **Memory Usage**: Peak memory consumption during inference
- **Latency**: Time taken for inference operations
- **Profiling Data**: Detailed performance traces (when enabled)
Contributor


Is there some duplication between these and L80-L84?

- **Latency**: Time taken for inference operations
- **Profiling Data**: Detailed performance traces (when enabled)

Results are saved in CSV format with columns for:
Contributor

@jerryzh168 jerryzh168 Jul 9, 2025


nit: I think you can just show a small example output here
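For illustration only, here is a sketch of what a couple of result rows might look like, built around the metrics listed above (speedup versus the bfloat16 baseline, peak memory, latency). The column names and every number below are placeholders invented for this sketch, not output from the framework.

```python
# Placeholder illustration of the CSV shape described above. Column names and
# all values are made up for the example; they are not real benchmark results.
import csv
import io

sample_csv = """\
model,quantization,speedup,peak_memory_mb,latency_ms
toy_linear,baseline,1.00,780.0,5.8
toy_linear,gemlitewo-4-64,1.70,512.0,3.4
"""

for row in csv.DictReader(io.StringIO(sample_csv)):
    print(f"{row['quantization']:>16}: {row['speedup']}x vs bfloat16 baseline")
```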


This guide is intended to provide instructions for the most fequent benchmarking use-case. If you have any use-case that is not answered here, please create an issue here: [TorchAO Issues](https://github.com/pytorch/ao/issues)

## Table of Contents
Contributor

@jerryzh168 jerryzh168 Jul 9, 2025


This is closer to use cases but still not really end use cases yet, I feel. I think it might be helpful to describe scenarios like:

(1) integrating new quantization techniques (e.g. a new float8 technique)
People might be interested in understanding the overall performance, accuracy, and memory-footprint impact of the new technique and how it compares to existing ones.
In terms of commits, only the last commit is important here, I think.

(2) kernel optimizations
People might be interested only in performance, and maybe the trace is also important.
They may put up a PR and go through multiple commits, or just update a single commit multiple times, and they want to understand the performance differences between commits.

(3) performance regression tracking
What is the entry point for this one? Will people receive an email when some threshold is passed?

(4) end users
Is there a dashboard we can show end users so that, if they are using torchao, they can understand what speedup and accuracy drop to expect for a certain technique, device, and model architecture, etc.?

Contributor Author


Will be addressed in PR #2512

Contributor Author

jainapurva commented Jul 9, 2025

@jerryzh168 As discussed offline, I'll address your comments in the follow-up PR for the end-user tutorial #2512

@@ -0,0 +1,215 @@
# Benchmarking Overview
Contributor


Also, "API Guide" might be a more accurate title.

@jainapurva jainapurva merged commit 64c1ce3 into main Jul 10, 2025
19 of 21 checks passed